ABSTRACT



Symbiodinium, a large group of dinoflagellates, live in symbiosis with marine protists
and invertebrate metazoans, and also occur free-living in the environment. Symbiodinium are
functionally variable and play critical energetic roles in symbiosis. Our knowledge of
Symbiodinium has been historically constrained by the limited number of molecular
markers available to study evolution in the genus. Here we compare six functional
genes, representing three cellular compartments, in the nine known Symbiodinium
lineages. Despite striking similarities among the single gene phylogenies from distinct
organelles, none were evolutionarily identical. A fully concatenated reconstruction,
however, yielded a well-resolved topology identical to that of the current benchmark nr28S
gene. Evolutionary rates differed among cellular compartments and clades, a pattern
largely driven by higher rates of evolution in the chloroplast genes of Symbiodinium
clades D2 and I. The rapid rates of evolution observed amongst these relatively
uncommon Symbiodinium lineages in the functionally critical chloroplast may translate
into potential innovation for the symbiosis. The multi-gene analysis highlights
the potential power of assessing genome-wide evolutionary patterns using recent
advances in sequencing technology and emphasizes the importance of integrating
ecological data with more comprehensive sampling of free-living and symbiotic
Symbiodinium in assessing the evolutionary adaptation of this enigmatic dinoflagellate.



Subjects Ecology, Evolutionary Studies, Marine Biology, Molecular Biology 
Keywords Symbiosis, Chloroplast, Rarity, Evolutionary rates, Mitochondria, Nuclear, 
Dinoflagellate, Symbiodinium, Multi-gene analysis 



INTRODUCTION 

Dinoflagellates in the genus Symbiodinium are essential components of coral reef 
ecosystems in their role as photosynthetic endosymbionts of a myriad of marine organisms 
belonging to at least five distinct phyla: Foraminifera, Porifera, Cnidaria, Mollusca, 
and Platyhelminthes (Trench, 1993). Although highly predominant within benthic 
hosts, symbiotic associations have also been reported in the pelagic medusa Cotylorhiza 
tuberculata (Astorga, Ruiz & Prieto, 2012). Perhaps best known for their relationship with 
scleractinian corals, Symbiodinium spp. underpin the productivity and calcification that
create coral skeletons and the structures known as coral reefs, which serve as habitat for the
immense biodiversity these coastal ecosystems support.

Research conducted during the last two decades has allowed extensive genotyping of
endosymbiotic Symbiodinium in both the Western Atlantic and Indo-Pacific Oceans and
across benthic host taxa at a variety of spatial and temporal scales (reviewed in Coffroth
& Santos, 2005; Franklin et al., 2012; Stat, Carter & Hoegh-Guldberg, 2006). Several recent
studies have also begun to describe Symbiodinium diversity in free-living environments,
including the water column (Manning & Gates, 2008; Pochon et al., 2010; Takabayashi et
al., 2012), sediments (Pochon et al., 2010; Porto et al., 2008; Takabayashi et al., 2012), coral
sand (Hirose et al., 2008), coral rubble (Coffroth et al., 2006), the surface of macroalgal
beds (Porto et al., 2008; Venera-Ponton et al., 2010), and fish feces (Castro-Sanguino &
Sanchez, 2012; Porto et al., 2008).

Historically, the pioneering work of Rowan & Powers (1992) divided the genus
Symbiodinium into three phylogenetic groups referred to as clades A-C using nuclear
small subunit ribosomal (nr18S) sequences. Despite the conserved nature of this marker,
sequence variation between clades is comparable to that between other dinoflagellate taxa
placed in different orders (Rowan & Powers, 1992). Later, more variable nuclear large
subunit ribosomal (nr28S) sequences were applied across broader host taxa and geographic
scales (Santos et al., 2002; Pawlowski et al., 2001; reviewed in Baker, 2003), which ultimately
led to the molecular classification of Symbiodinium into nine lineages (Pochon & Gates,
2010), clades A through I (Table 1). Clades D, F, and G have been further divided into
sub-clades D1-D2, F2-F5, and G1-G2, respectively, using nr28S and the chloroplast
large subunit ribosomal DNA (cp23S) domain V (Hill et al., 2011; Pochon, Lajeunesse &
Pawlowski, 2004; Pochon et al., 2006). Comparative phylogenetic reconstructions have
yielded similar evolutionary relationships among Symbiodinium clades using nr28S and
cp23S genes (Santos et al., 2002; Pochon & Gates, 2010), as well as when using the coding
region of the plastid-encoded photosystem II protein D1 (psbA; Takishita et al., 2003),
the mitochondrial cytochrome oxidase I (coI; Takabayashi, Santos & Cook, 2004), and
mitochondrial cytochrome b (cob; Zhang, Bhattacharya & Lin, 2005; Sampayo, Dove
& Lajeunesse, 2009). However, compared to other markers the nr28S gene typically
yields better-resolved phylogenies and is therefore considered a 'benchmark gene' for
clade-level analysis of Symbiodinium (Pochon et al., 2012).

The nine existing clades and eight sub-clades of Symbiodinium have been largely
delineated based on host-symbiont associations (Table 1). For example, clades A, B, C,
and D1 most commonly associate with Molluscan and Cnidarian hosts, clades B, D2, and
G2 with Poriferan hosts, and clades F, G1, H, and I with Foraminifera. To date, the majority
of Symbiodinium clades have also been found in the free-living environment (Table 1),
particularly clades A and B, which appear to contain a high number of unique strains that may
be exclusively adapted to a free-living mode of life (Coffroth et al., 2006; Hirose et al., 2008;
Takabayashi et al., 2012; Yamashita & Koike, 2013). However, representatives from all clades
are likely to be soon characterized from the free-living environment as novel sequencing
technologies now provide researchers with unprecedented screening sensitivity and ability
to quickly design novel Symbiodinium-specific markers with increased resolution. To
date, a number of high-resolution markers have been employed for fine-scale studies
investigating the biogeography, host specificity, physiology, and ecological partitioning of
specific strains within Symbiodinium clades, including microsatellite loci (Thornhill et al.,
2009), the Internal Transcribed Spacer regions 1 and 2 of the nuclear ribosomal DNA (van
Oppen et al., 2005; Iglesias-Prieto et al., 2004), and recently the non-coding region of psbA
(Lajeunesse & Thornhill, 2011; Thornhill et al., 2014). However, none of these markers has
yet been successfully employed for characterizing free-living populations of Symbiodinium
due to clear challenges of specifically targeting Symbiodinium against the backdrop of the
complex micro- and meio-eukaryotic diversity found in environmental samples. In an
attempt to characterize novel markers for symbiotic and free-living Symbiodinium, Pochon et
al. (2012) used available Expressed Sequence Tag (EST) libraries for Symbiodinium
(Leggat et al., 2007; Voolstra et al., 2008) to identify 84 candidate genes, and performed in-depth
phylogenetic analyses of four relatively fast evolving genes (coI, calmodulin, rad24, and
actin). Other more conserved genes, including the elongation factor 2 (elf2) and the cob
genes, were also recovered from EST libraries and sequenced for future clade-level analysis.

Table 1 Summary of existing Symbiodinium lineages. The nine clades (A-I) and eight sub-clades (D1-D2, F2-F5, and G1-G2) that constitute the
genus Symbiodinium, with selected literature highlighting the habitat prevalence/preference of each lineage.

Clade/     in hospite/  Habitat                 References
Sub-clade  free-living  Preferences/Prevalence
A          in hospite   Cnidaria                (Lajeunesse, 2001; Reimer et al., 2006; Stat, Morris & Gates, 2008)
A          in hospite   Mollusca                (Baillie, Belda-Baillie & Maruyama, 2000; Ishikura et al., 2004; Lajeunesse et al., 2010)
A          in hospite   Platyhelminthes         (Baillie, Belda-Baillie & Maruyama, 2000)
A          free-living  Water column            (Manning & Gates, 2008; Pochon & Gates, 2010; Takabayashi et al., 2012)
A          free-living  Sediment                (Pochon & Gates, 2010; Porto et al., 2008; Takabayashi et al., 2012)
A          free-living  Reef sand/rubbles       (Coffroth et al., 2006; Hirose et al., 2008)
A          free-living  Macroalgal beds         (Porto et al., 2008)
A          free-living  Fish feces              (Castro-Sanguino & Sanchez, 2012; Porto et al., 2008)
B          in hospite   Cnidaria                (Coffroth, Santos & Goulet, 2001; Lajeunesse, 2001; Santos, Taylor & Coffroth, 2001)
B          in hospite   Mollusca                (Lajeunesse, 2002)
B          in hospite   Porifera                (Hunter, Lajeunesse & Santos, 2007)
B          free-living  Water column            (Manning & Gates, 2008; Pochon & Gates, 2010; Takabayashi et al., 2012)
B          free-living  Sediment                (Pochon & Gates, 2010; Porto et al., 2008; Takabayashi et al., 2012)
B          free-living  Reef rubbles            (Coffroth et al., 2006)
B          free-living  Macroalgal beds         (Porto et al., 2008)
B          free-living  Fish feces              (Castro-Sanguino & Sanchez, 2012; Porto et al., 2008)
C          in hospite   Foraminifera            (Pochon et al., 2001; Pochon et al., 2006; Pochon et al., 2007; Pochon, Lajeunesse & Pawlowski, 2004)
C          in hospite   Cnidaria                (Coffroth & Santos, 2005; Lajeunesse, 2005; Sampayo et al., 2007; Wagner et al., 2011)
C          in hospite   Mollusca                (Baillie, Belda-Baillie & Maruyama, 2000; Ishikura et al., 2004; Lajeunesse et al., 2010)
C          in hospite   Platyhelminthes         (Baillie, Belda-Baillie & Maruyama, 2000)
C          free-living  Water column            (Manning & Gates, 2008; Pochon & Gates, 2010; Takabayashi et al., 2012)
C          free-living  Sediment                (Pochon & Gates, 2010; Porto et al., 2008; Takabayashi et al., 2012)
C          free-living  Macroalgal beds         (Porto et al., 2008; Venera-Ponton et al., 2010)
D1         in hospite   Cnidaria                (Brown et al., 2000; Correa & Baker, 2009; Jones et al., 2008)
D1         in hospite   Mollusca                (Ishikura et al., 2004; Lajeunesse et al., 2010)
D1         free-living  Water column            (Manning & Gates, 2008; Takabayashi et al., 2012)
D2         in hospite   Foraminifera            (Pochon et al., 2007; Garcia-Cuetos, Pochon & Pawlowski, 2005)
D2         in hospite   Porifera                (Carlos et al., 1999)
E          in hospite   Cnidaria                (Lajeunesse & Trench, 2000; Lajeunesse, 2001)
E          free-living  Water column            (Carlos et al., 1999; Gou et al., 2003; Santos, 2004)
F2         in hospite   Foraminifera            (Pochon et al., 2001; Pochon et al., 2006; Pochon et al., 2007; Pochon & Gates, 2010)
F2         in hospite   Cnidaria                (Rodriguez-Lanetty, Cha & Song, 2002)
F3         in hospite   Foraminifera            (Pochon et al., 2001; Pochon et al., 2006; Pochon et al., 2007; Pochon & Gates, 2010)
F4         in hospite   Foraminifera            (Pochon et al., 2001; Pochon et al., 2006; Pochon et al., 2007; Pochon & Gates, 2010)
F5         in hospite   Foraminifera            (Pochon et al., 2001; Pochon et al., 2006; Pochon et al., 2007; Pochon & Gates, 2010)
G1         in hospite   Foraminifera            (Pochon et al., 2001; Pochon et al., 2006; Pochon et al., 2007; Pochon & Gates, 2010)
G2         in hospite   Cnidaria                (Bo et al., 2011; van Oppen et al., 2005)
G2         in hospite   Porifera                (Schoenberg & Loh, 2005; Schoenberg et al., 2008; Hill et al., 2011)
G2         free-living  Water column            (Takabayashi et al., 2012)
G2         free-living  Sediment                (Takabayashi et al., 2012)
G2         free-living  Fish feces              (Castro-Sanguino & Sanchez, 2012)
H          in hospite   Foraminifera            (Pochon et al., 2001; Pochon et al., 2006; Pochon et al., 2007; Pochon & Gates, 2010)
H          free-living  Water column            (Manning & Gates, 2008)
I          in hospite   Foraminifera            (Pochon & Gates, 2010)

Our current understanding of the divergence pattern and evolution of Symbiodinium
clades is relatively limited. A standard molecular clock using nr28S sequence data
suggested that the ancestor of the Symbiodinium species complex evolved around the K-T
boundary (65 MYA) in warm tropical waters (Tchernov et al., 2004), which corresponds
to a major transition from the extinct Mesozoic rudist-based reefs to the modern
scleractinian-dominated reefs. Pochon & Pawlowski (2006) later employed a relaxed
molecular clock approach with nr28S data and suggested that Symbiodinium clades started
to diversify from ancestral clade A some 50 MYA, at the beginning of the Eocene. Their analysis
revealed that the major diversifications of clades occurred during global cooling periods:
the origination of Symbiodinium clades A, B, D, E, and G during the Eocene cooling,
followed by a massive radiation that has taken place in all lineages since the mid-Miocene (15 MYA).

To improve our understanding of Symbiodinium clade evolution, in this study
we present a 'clade-level' multi-gene analysis incorporating samples from all known
Symbiodinium clades and sub-clades (Table 2). We selected two genes from each of three distinct
organelles (nucleus: nr28S & elf2; chloroplast: cp23S & psbA; and mitochondria: coI &
cob) to test the following hypotheses: (1) single gene phylogenies will yield statistically
distinct clade relationships; (2) a six-gene concatenated tree will be statistically different
from the benchmark nr28S tree; and (3) pair-wise relative substitution rate analyses will reveal
compartment-specific differences in evolutionary rates among Symbiodinium clades and
organelles. Our results are integrated within the current state of knowledge of free-living
and endosymbiotic Symbiodinium lineages (Table 1) and may serve as a basis for future
studies investigating evolutionary implications of rarity and symbiotic/free-living lifestyles
among Symbiodinium dinoflagellates.

MATERIALS AND METHODS 
DNA samples 

Thirty-four DNA samples encompassing all known Symbiodinium clades (A-I) and
sub-clades (F2-F5; D1-D2; G1-G2) were selected for phylogenetic analyses (Table 2).
These samples included fifteen axenic Symbiodinium cultures belonging to five
clades/sub-clades (A, B, D, E, and F5), seventeen samples originally isolated from symbiotic soritid
foraminifera (Pochon et al., 2007; Pochon & Gates, 2010) belonging to six Symbiodinium
clades/sub-clades (C, D2, F2-F4, G1, H, and I), and two samples extracted from the
symbiotic bioeroding sponge genus Cliona and belonging to Symbiodinium sub-clade G2 (see
Bo et al., 2011; Hill et al., 2011). Additionally, three cultured dinoflagellates, Gymnodinium
simplex [CCMP 419], Pelagodinium beii (Siano et al., 2010), and Polarella glacialis [CCMP
1383], were used as outgroups in our analyses following Pochon et al. (2012).

Gene selection, DNA extraction and sequencing

Six genes from three organelles were chosen for phylogenetic analyses. These include
two nuclear genes, (1) nr28S (D1-D3 region) [920 base-pairs] and (2) elf2 [473 bp];
two chloroplast genes, (3) cp23S (Domain V) [647 bp] and (4) the coding region of
psbA [700 bp]; and two mitochondrial genes, (5) coI [1,057 bp] and (6) cob [906 bp].
Sequences for analysis were gathered from 26 samples obtained from a previous study
(Pochon et al., 2012); nine DNA samples that were extracted and partially analyzed in other
studies (Pochon et al., 2007; Pochon & Gates, 2010) and further sequenced here to cover
all genes using the primers and PCR cycling conditions described in Pochon et al. (2012);
and two DNA samples that were extracted from sponge tissues of the genus Cliona (courtesy
of C. Schoenberg) and sequenced for all genes following Pochon et al. (2012) (see Table 2).
The psbA gene was not reported in Pochon et al. (2012) and was PCR amplified in this
study using the forward primer psbA_1.0 (5'-CWGTAGATATTGATGGWATAAGAGA-3')
located at the 5' end of the coding region and the reverse primer psbA_3.0
(5'-TTGAAAGCCATTGTYCTTACTCC-3') located approximately 700 bp downstream from
the 5' end, with standard thermocycling conditions and an annealing temperature
of 52 °C. All sequences were obtained by direct sequencing, except for nr28S and cp23S
sequences, which were cloned prior to sequencing in Pochon et al. (2012), with a single
sequence per sample included in the present study. In all cases, the variability between
cloned sequences of any given sample was minimal (e.g., see Figure S1 of Pochon et al.,
2012), ranging between 0 and 4 bp difference (data not shown). The sequence
showing the shortest branch length in each sample was selected (data not shown). In cases
where several sequences showed the same short branch length, one sequence was randomly
chosen among them and included in the analysis.
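
The two psbA primers contain degenerate IUPAC bases (W = A or T; Y = C or T), so each primer stands for a small pool of concrete oligonucleotides. As a minimal sketch, assuming only the standard IUPAC ambiguity codes, the following Python snippet expands each primer into the sequences it represents. The function name `expand_primer` is illustrative and is not part of the original study.

```python
from itertools import product

# Standard IUPAC codes: the four bases plus the two-base ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
}

def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

FORWARD = "CWGTAGATATTGATGGWATAAGAGA"  # psbA forward primer as given in the text
REVERSE = "TTGAAAGCCATTGTYCTTACTCC"    # psbA reverse primer as given in the text

forward_variants = expand_primer(FORWARD)
reverse_variants = expand_primer(REVERSE)

# Primer length and number of concrete variants for each primer.
print(len(FORWARD), len(forward_variants))
print(len(REVERSE), len(reverse_variants))
```

Two W positions in the forward primer give four variants, and the single Y in the reverse primer gives two.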

Phylogenetic analyses 

DNA sequences were inspected and assembled using Sequencher v4.7 (Gene Codes
Corporation, Ann Arbor, MI, USA) and manually aligned with BioEdit v5.0.9 sequence
alignment software (Hall, 1999). Thirteen distinct DNA alignments were generated:
six individual gene alignments, one fully concatenated
alignment of all six genes (ALL Concat), and six partially concatenated alignments
covering all possible combinations of five genes (i.e., each alignment excluded one of the six
candidate genes). Concatenated alignments were created using the 'join sequence files'
option in TREEFINDER v12.2.0 (Jobb, von Haeseler & Strimmer, 2004). elf2 was included in
these analyses despite two missing samples (see samples #27 and #30; Table 2), which were
coded as missing data in all concatenated alignments. GenBank accession numbers for all
investigated sequences are shown in Table 2.
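
The thirteen alignments described above (six single-gene, six leave-one-out, one full) can be enumerated mechanically. The sketch below is a hedged illustration in Python, not the TREEFINDER procedure used in the study: the helper names, the toy sequences, and the use of `?` as the missing-data symbol are all assumptions made for the example.

```python
from itertools import combinations

GENES = ["nr28S", "elf2", "cp23S", "psbA", "coI", "cob"]

def alignment_sets(genes):
    """Six single-gene sets, six leave-one-out sets, and the full set."""
    singles = [(g,) for g in genes]
    five_gene = list(combinations(genes, len(genes) - 1))
    full = [tuple(genes)]
    return singles + five_gene + full

def concatenate(per_gene, gene_order, lengths, missing="?"):
    """Join aligned sequences per sample; pad absent genes with `missing`."""
    samples = sorted({s for g in gene_order for s in per_gene[g]})
    return {
        s: "".join(per_gene[g].get(s, missing * lengths[g]) for g in gene_order)
        for s in samples
    }

# Hypothetical toy data: sample s2 has no elf2 sequence.
toy = {
    "nr28S": {"s1": "ACGT", "s2": "ACGA"},
    "elf2": {"s1": "GG"},
}
toy_lengths = {"nr28S": 4, "elf2": 2}
joined = concatenate(toy, ["nr28S", "elf2"], toy_lengths)

print(len(alignment_sets(GENES)))
print(joined)
```

Padding a missing gene to its full alignment length keeps every concatenated row the same width, which is what partitioned analyses expect.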

Each DNA alignment was analyzed independently under both Maximum-likelihood
(ML) and Bayesian frameworks. Best-fit models of evolution were estimated for each
alignment (see Table S1) using Modeltest v3.7 (Posada & Crandall, 1998). ML analyses were
carried out using PhyML v3.0 (Guindon et al., 2009), and the reliability of internal branches
was assessed using 100 bootstraps with subtree pruning-regrafting branch swapping.
Bayesian tree reconstructions with posterior probabilities were inferred using MrBayes
v3.2 (Ronquist et al., 2012), using the same model of DNA evolution as for the ML analyses.
Four simultaneous Markov chains were run for 1,000,000 generations with trees sampled
every 10 generations, and the first 50,000 trees were discarded as "burn-in" based on visual
inspection. Concatenated alignments were analyzed under ML and Bayesian frameworks as
described above, with the alignments partitioned so that each gene fragment was assigned
its own model of evolution.
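
The Bayesian sampling settings above imply a simple bookkeeping of sampled and retained trees. The following Python lines work through that arithmetic as a rough sketch; they ignore the initial (generation 0) sample and any replication across independent runs, both of which are simplifying assumptions.

```python
GENERATIONS = 1_000_000  # chain length stated in the text
SAMPLE_EVERY = 10        # sampling interval stated in the text
BURN_IN_TREES = 50_000   # trees discarded as burn-in, stated in the text

sampled_trees = GENERATIONS // SAMPLE_EVERY        # trees written per run
retained_trees = sampled_trees - BURN_IN_TREES     # trees left after burn-in
burn_in_fraction = BURN_IN_TREES / sampled_trees   # share of samples discarded
burn_in_generations = BURN_IN_TREES * SAMPLE_EVERY # burn-in expressed in generations

print(sampled_trees, retained_trees, burn_in_fraction, burn_in_generations)
```

Under these assumptions, half of the sampled trees (the first 500,000 generations) are discarded.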

Topological tests, rate calculations, and statistical analyses 

To compare the topology of the various trees, approximately unbiased (AU) topological
congruency tests (Shimodaira, 2002) were performed using site likelihoods calculated in
RAxML v7.2.5 (Stamatakis, 2006), followed by AU tests in CONSEL (Shimodaira &
Hasegawa, 2001) with default scaling and replicate values. elf2 was excluded from the single
gene analyses due to missing data (samples #27 and #30; Table 2), but was included in the
concatenated analyses (see above).

In order to determine evolutionary rates among Symbiodinium lineages for each of
the six investigated genes, relative-rate tests (RRT) were performed using the program
RRTREE v1.1 (Robinson-Rechavi & Huchon, 2000). Clades and sub-clades were compared
in a pair-wise fashion with G. simplex as the outgroup. Relative rates of evolution (K-scores
from the RRTREE analysis above) were compared among clades and among cellular organelles
using a two-way ANOVA, followed by post hoc analysis with Tukey's Honestly Significant
Difference (THSD) test.
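
The idea behind a relative-rate comparison can be illustrated with a deliberately simplified sketch: two lineages are each compared to the same outgroup, and the difference between the two distances indicates which lineage has accumulated more change. The Python code below uses uncorrected p-distances on hypothetical toy sequences. RRTREE itself uses model-based distances and a statistical test on the difference, so this is only the core comparison and not the program's algorithm; all names here are illustrative.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps or missing data in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a not in "-?N" and b not in "-?N"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def relative_rate_difference(lineage_1, lineage_2, outgroup):
    """K1 - K2: positive when lineage_1 is more diverged from the outgroup."""
    return p_distance(lineage_1, outgroup) - p_distance(lineage_2, outgroup)

# Hypothetical toy sequences (not data from the study).
outgroup = "ACGTACGTAC"
slow = "ACGTACGTAT"  # 1 of 10 sites differs from the outgroup
fast = "ATGTTCGAAT"  # 4 of 10 sites differ from the outgroup

delta = relative_rate_difference(fast, slow, outgroup)
print(round(delta, 3))
```

A positive value for `delta` means the first lineage has diverged further from the outgroup than the second over the same elapsed time.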

RESULTS 

DNA alignments for the six investigated genes ranged between 473 (elf2) and 1,057
bp (coI). Individual phylogenies were generated (Fig. 1), and each was compared to
the topology obtained with the nr28S gene, which is the current molecular taxonomic
benchmark for the clade-level classification of Symbiodinium (Hill et al., 2011; Pochon &
Gates, 2010; Pochon et al., 2012). Overall, the cladal relationships were remarkably similar
among the genes investigated, particularly the basal positions of clades A, D, E and G, and
the derived positions of clades B, C, F, H, and I. Symbiodinium clades were relatively well
resolved in the nuclear and chloroplastic genes, but not the mitochondrial genes, which
placed clades C, F, and H in completely unresolved monophyletic groups (see Figs. 1E
and 1F). However, with the exception of nr28S, the relationships amongst clades were
weakly supported for all markers, especially in the higher parts of the trees, and this was
particularly evident for psbA, where relationships between clades B, C, D, F, G, H, and I
were completely unresolved (Fig. 1D). Furthermore, the relationships between sub-clades
within clades D, F, and G showed contrasting results. Well-supported monophyly of all
sub-clades was only observed in the nr28S gene (Fig. 1A). Notably, however, clade G
sub-clades (G1 and G2) formed a monophyletic group across all genes. In contrast, the
monophyly of clade F and clade D sub-clades was only resolved with nr28S (Fig. 1A) and
with nr28S and cob (Figs. 1A and 1F), respectively. All Symbiodinium strains belonging to the
same sub-clade grouped together across all genes, with two noteworthy exceptions. First,
the four samples of sub-clade F5 (#14-16) separated into two groups in cob (Fig. 1F).
Second, sample #24 (Table 2) of sub-clade D2 diverged significantly to the root of the tree
in cp23S (Fig. 1C).

Figure 1 Single-gene phylogenies of Symbiodinium using two genes from three organelles. Best
Maximum likelihood (ML) topologies for Symbiodinium clades and sub-clades A to I based on the nuclear
genes (A) nr28S and (B) elf2, the chloroplastic genes (C) cp23S and (D) psbA, and the mitochondrial
genes (E) coI and (F) cob. Numbers in brackets refer to the Symbiodinium strains detailed in Table 2.
Numbers at nodes represent the ML bootstrap pseudoreplicate (BP) values (underlined numbers; 100 BP
performed) and Bayesian posterior probabilities (BiPP). Black dots represent nodes with <95% BP and
BiPP of 1.0. Nodes without numbers correspond to BP and BiPP lower than 70% and 0.8, respectively.
Nodes displaying BP lower than 50% were manually collapsed. The phylograms were rooted using the
dinoflagellates Gymnodinium simplex, Pelagodinium beii, and/or Polarella glacialis. GenBank accession
numbers are given in Table 2. Note: All clades are represented, except for clade E in the elf2 phylogeny.

In order to increase the phylogenetic signal and assess which of the individual markers
best reflects the most well resolved evolutionary history of Symbiodinium, a series of gene
concatenation analyses were conducted. In total, seven distinct concatenated alignments
were analyzed, including one fully concatenated alignment of all six genes (ALL Concat)
with a total length of 4,703 bp, and six partially concatenated alignments ranging
in length from 3,646 bp (ALL except coI) to 4,230 bp (ALL except elf2), which together cover
all possible combinations of five genes (see Methods). Phylogenetic analysis of the fully
concatenated dataset (ALL Concat, Fig. 2) resulted in a highly resolved Symbiodinium
tree with a topology identical to that of the nr28S gene, but with much stronger phylogenetic signal
as evidenced by a significant increase in statistical support at all nodes (Fig. 2). Other
concatenated alignments yielded weaker node support and unstable cladal relationships
globally (data not shown).
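
The alignment lengths quoted above follow directly from the per-gene fragment lengths given in the Methods. As a quick consistency check, the Python lines below sum the six fragment lengths and derive the length of each five-gene alignment; the dictionary keys are just labels chosen for this sketch.

```python
# Fragment lengths in base pairs, as listed in the Methods.
GENE_LENGTHS_BP = {
    "nr28S": 920, "elf2": 473, "cp23S": 647,
    "psbA": 700, "coI": 1057, "cob": 906,
}

total_bp = sum(GENE_LENGTHS_BP.values())

# Length of each alignment that excludes exactly one gene.
leave_one_out_bp = {gene: total_bp - n for gene, n in GENE_LENGTHS_BP.items()}

shortest = min(leave_one_out_bp, key=leave_one_out_bp.get)
longest = max(leave_one_out_bp, key=leave_one_out_bp.get)

print(total_bp)
print(shortest, leave_one_out_bp[shortest])
print(longest, leave_one_out_bp[longest])
```

The shortest five-gene alignment is the one that drops the longest gene, and the longest is the one that drops the shortest gene, matching the range reported in the text.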

Approximately unbiased (AU) topological congruency tests (Shimodaira, 2002) were
used to verify whether any of the distinct phylogenies resulted in statistically identical
topologies. First, pair-wise comparisons of single gene phylogenies (Fig. 1) resulted
in significant p-values (p < 0.05) in all cases, indicating that the different genes have
not followed identical evolutionary trajectories (see Table S2A). Second, concatenated
topologies tested against single gene topologies also resulted in significant p-values in all
instances (data not shown). Third, pair-wise comparisons of single gene phylogenies to the
concatenated topologies revealed that the two longest genes, coI and nr28S, resulted in 5
and 6 significant topological comparisons, respectively (see Table S2B). Despite the
relatively smaller size of nr28S (920 bp) compared to coI (1,057 bp), nr28S was the only marker
yielding a topology statistically identical to the fully concatenated topology (ALL Concat).
The nr28S topology, however, was not identical to the best topology of the concatenated
alignment excluding the nr28S gene fragment (see ALL except nr28S in Table S2B).
Similarly, pair-wise comparisons of concatenated topologies revealed that significant p-values
(p < 0.05) were only observed against the 'ALL except nr28S' topology (Table S2B).

The variable branch lengths observed in the six phylograms (Fig. 1) are directly
proportional to the amount of character change; hence the longest branches are indicative
of increased evolutionary rates of any given Symbiodinium strain. In most cases, increased
rates of Symbiodinium clades/sub-clades appeared to be gene-specific rather than a
character state maintained across all markers. K-scores from relative rate tests were coupled
with ANOVA to compare the relative rates of evolution among the clades and organelles
(Fig. 3), examining all clades across the three markers. There was no significant interaction
of clade and organelle (F(16,175) = 1.57, p = 0.081), indicating that the pattern of changes
in rates of evolution among clades was similar across organelles. Overall, the general
pattern of slower relative rates of evolution for some of the basal clades (A, E) and faster
rates in more derived clades (C, F, H, and I) held across organelles. However, organelles
differed in their relative rates of evolution (F(2,175) = 248.9, p = 0.0001), driven by rapid
rates in the chloroplastic and nuclear compartments in comparison to the mitochondrial
compartment (Fig. 3A), with the most rapid rates found in the chloroplastic markers due to
the high evolutionary rates of clade I and sub-clade D2 (see Figs. 1C and 1D). Additionally,
there was a significant difference between clades (F(8,175) = 3.87, p = 0.0003) driven by the
slow rates of clade A and the rapid rates of clade I (Fig. 3B).

Figure 2 Best topology of Symbiodinium based on six concatenated genes. Maximum likelihood (ML)
topology for Symbiodinium clades and sub-clades A to I based on the fully concatenated DNA alignment
(ALL Concat; 4,703 bp) of all six genes investigated in this study. The Symbiodinium strains within
each clade/sub-clade are referred to using the specific numbers and corresponding ITS2 names in brackets
(Table 2, Fig. 1). Numbers at nodes represent the ML bootstrap pseudoreplicate (BP) values (underlined
numbers; 100 BP performed) and Bayesian posterior probabilities (BiPP). Black dots represent nodes
with 100% BP and BiPP of 1.0. The phylograms were rooted using the dinoflagellates Gymnodinium
simplex, Pelagodinium beii, and Polarella glacialis.

Figure 3 Comparison of relative rates of evolution among Symbiodinium organelles and clades. Plot
of mean relative rates of evolution (mean ± sem) across the (A) three organelles and (B) nine clades.
Lower case, italicized letters above the bars represent post hoc THSD tests with significant differences
between (A) the three organelles and (B) between clades (groups of three bars). Sample sizes are shown
at the base of each bar, except clade F, where for each bar n = 20.
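
The degrees of freedom attached to the three F statistics are what one would expect from a two-way ANOVA with nine clades and three organelles. The Python sketch below works through that arithmetic. It assumes a standard full-factorial design with an interaction term; the implied number of K-score observations is an inference from the error degrees of freedom (175) and is not a figure reported in the text.

```python
N_CLADES = 9         # clades A through I
N_ORGANELLES = 3     # nucleus, chloroplast, mitochondrion
DF_ERROR = 175       # error degrees of freedom shown with each F statistic

df_clade = N_CLADES - 1                   # main effect of clade
df_organelle = N_ORGANELLES - 1           # main effect of organelle
df_interaction = df_clade * df_organelle  # clade x organelle interaction

# Total df = N - 1, so the implied number of observations (an assumption-based inference):
implied_observations = df_clade + df_organelle + df_interaction + DF_ERROR + 1

print(df_clade, df_organelle, df_interaction, implied_observations)
```

The resulting numerator degrees of freedom (8, 2, and 16) correspond to the clade, organelle, and interaction tests, respectively.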

DISCUSSION

Multi-gene analysis supports nr28S as a benchmark lineage marker

Our knowledge of Symbiodinium evolution has historically been constrained by the limited
number of phylogenetic markers that have been applied to this group. To date, fewer than
15 DNA loci have been used to examine Symbiodinium diversity in a phylogenetic context
(Lajeunesse & Thornhill, 2011; Pochon et al., 2012; Rowan & Powers, 1992; Sampayo, Dove
& Lajeunesse, 2009; Takabayashi, Santos & Cook, 2004; Takishita et al., 2003; van Oppen et
al., 2001), and evolutionary relationships among all existing Symbiodinium lineages have
never been inferred using more than two concatenated genes (Pochon & Gates, 2010). This
study is the first to perform a multi-gene analysis using six markers representing three
cellular organelles and integrating biological samples from all known clades and selected
sub-clades that encompass the genus Symbiodinium. In spite of the overall similarity
among the trees for each nuclear, chloroplastic and mitochondrial gene (Fig. 1), their
topologies were statistically different (Table S2). This reflects within- and among-clade
differences inherent to the individual markers, most notably the unstable positions
of clades D, E, F5 and H, as well as weak support for among-clade relationships observed
in most markers investigated. Long-branch attraction artifacts (Felsenstein, 1985) most
likely accounted for the placement of sub-clade D2 (sample #24) at the root of the tree in
the chloroplast 23S topology, and for the monophyly of samples #7, 8, 13, and 14 in the
cob topology. While the markers investigated here are conserved genes that have a priori
limited utility for finer scale (i.e., within clade) analysis, each contains a unique set of
characteristics, including variable cladal resolution and/or evolutionary rates (e.g., see
samples #2 and #3 in coI or samples #7, 8, 13, 14 in cob); hence each marker has the
potential to address different questions. These differences thus support our previous
conclusion that no one gene fits all of the taxonomic questions being asked in the genus
Symbiodinium (Pochon et al., 2012).

Our fully concatenated analysis, incorporating all investigated genes and totaling
4,703 bp, resulted in a highly resolved phylogeny that was statistically identical to that of the
nr28S gene, the gene used as the benchmark for assigning Symbiodinium lineages (Fig. 2;
Table S2). The fact that the concatenated nuclear, chloroplastic, and mitochondrial genes
display overall similar evolutionary histories suggests that the molecular taxonomy of
the currently recognized Symbiodinium clades using nr28S is robust (Pochon et al., 2006;
Pochon & Gates, 2010), and that the points of clade differentiation are ancient, allowing for
a concerted evolution of these conserved genes across genomes. These new results support
a sequential evolution of Symbiodinium clades A/E/G1-G2/D1-D2/I/B/F2-F5/H/C, from
most ancestral to most derived, respectively. It appears that there is a level of constraint in
the system, with recombination likely being a rare event (Santos & Coffroth, 2003; but see
Chi, Parrow & Dunthorn, 2014), a feature that maintains separation among lineages.

Compartment-specific evolution and link to environmental preference/prevalence

Dinoflagellates are characterized by several genetic distinguishing features, including large 
genome size, and complex architecture and gene regulation (Barbrook et al., 2010; Hackett 
et al., 2004; Howe, Nisbet & Barbrook, 2008). One prominent feature is the large number 
of genes that have relocated from the ancestral organellar genome to the nucleus, resulting 
in a significant reduction in plastid and mitochondrial genomes. For example, the 
few genes that remain in the plastid of peridinin-containing dinoflagellates are primarily 
the core subunits of the photosystem (including cp23S), and the cytochrome b6f and 
ATP synthase complex (about 16 genes including psbA) (Hackett et al., 2004). Similarly, 
the mitochondrial genome of dinoflagellates has been reduced to three protein-coding 
genes (coI, coIII, and cob), but also contains a large number of non-functional fragments 
separated by repetitive non-coding DNA (Barbrook et al., 2010; Waller & Jackson, 2009). 
Despite the fact that the six Symbiodinium genes investigated here are only a very small 
subset of the Symbiodinium genome, they are physically separated in three cellular 
compartments, each with distinct evolutionary constraints and potential. For example, our 
comparisons of evolutionary rates between markers revealed that the differences among 
cellular compartments were primarily driven by the dissimilarity in the rates of evolution in 
cp23S and psbA in Symbiodinium lineages D2 and I (Figs. 1 and 2). 

A possible explanation is that the increased evolutionary rates reflect rarity and 
adaptation to marginal habitats. It has been posited that rare taxa are important in driving 
evolutionary trajectories and innovations (Holt, 1997). Rarity in terms of small population 
size and isolation can drive high rates of adaptation and speciation (e.g., peripheral 
speciation; Mayr, 1963), as mutations in rare species are more likely to accumulate in 
the periphery of the founding population's habitat where rare species may be subjected 
to persistent directional selection in the absence of gene flow, as they colonize new areas 
(Garcia-Ramos & Kirkpatrick, 1997). Such a scenario is supported by the fact that lineages 
D2 and I have only been documented on few occasions (Carlos et al., 1999; Pochon et al., 
2007; Pochon & Gates, 2010), despite numerous Symbiodinium surveys conducted over the 
last 20 years in both the Western Atlantic and Indo-Pacific Oceans targeting a diversity of 
host taxa, as well as free-living communities, and crossing a variety of spatial and temporal 
scales (reviewed in Coffroth & Santos, 2005; Stat, Carter & Hoegh-Guldberg, 2006). In 
addition, Symbiodinium D2 and I have only been detected in the Hawaiian Archipelago and 
Micronesia (Guam and Palau), some of the most isolated island groups in the world and areas 
known for harboring high levels of endemism in marine biodiversity (Hughes, Bellwood 
& Connolly, 2002; Pauley, 2003). Both lineages have been suspected to either be free-living 
because of the manner in which the sample was isolated (Carlos et al., 1999), or recently 
ingested free-living strains due to their apparent rarity in nature (Pochon & Gates, 2010). 

The high rates of evolution in chloroplastic genes in Symbiodinium sub-clade D2 
and clade I might also reflect a relatively recent transition from free-living to symbiotic 
lifestyles. These habitats are extremely different in nature and composition, with free-living 
environments exhibiting high levels of environmental variability and unpredictability, 
while symbiotic habitats are relatively more predictable being spatially constrained and 
influenced by the biology of the host. These environmental differences undoubtedly 
drive the very different morphologies of Symbiodinium found in these two habitats, with 
free-living Symbiodinium flagellated and motile, and symbiotic Symbiodinium encysted 
and immotile. In terms of evolutionary trajectories, such differences in environment must 
exert a profound influence. Symbiodinium strains evolving predominantly in symbiosis 
must have adapted particular biochemical and chloroplastic functions in an environment 
that bears little or no resemblance to a free-living setting. Previous studies on the transition 
between symbiotic and free-living habitat show that changes in evolutionary rate occur 
in bacteria that have transitioned from free-living to a symbiotic lifestyle and mutualism 
(Lutzoni & Pagel, 1997; Moran, 1996). In addition, in some ectomycorrhizal assemblages, 
changes in evolutionary rate correspond to reversing from symbiotic to free-living lifestyle 
(Hibbett, Gilbert & Donoghue, 2000). Further, rapid and extreme environmental changes 
may favor the survival of rare and transitioning species, as their existing phenotypic 
diversity may contain traits pre-adapted to a changing environment (Holt, 1997). 

Our examination of evolutionary rates for multiple markers and organelles provides 
an opportunity to begin addressing the implications for gene and genome evolution 
due to symbiotic lifestyle and dissimilarities in organellar genome constraints. Here we 
see the significantly slower evolutionary rates of Symbiodinium clade A compared to 
other clades as well as overall slower relative rates of the mitochondrial compartment 
across all clades (Fig. 3). As recently highlighted by Decelle (2013), the predominance of 
a particular lifestyle in marine microalgae (i.e., symbiotic versus free-living) can have 
important implications in genome evolution. Symbiodinium clade A is a basal lineage 
known to date back to at least 50 MYA and which has possibly survived through the 
climatic vicissitudes of the K-T boundary (Tchernov et al., 2004; Pochon et al., 2006). 
This clade easily overgrows other Symbiodinium clades in culture (Carlos et al., 1999) 
and shows attributes of parasitism in scleractinian corals (Stat, Morris & Gates, 2008). 
Additionally, clade A contains a high number of unique strains that may never establish 
symbiotic relationships (e.g., Coffroth et al., 2006; Hirose et al., 2008; Yamashita & Koike, 
2013), and evolves at a similar rate to its close pelagic dinoflagellate relatives, contrasting 
with all other Symbiodinium clades which on average evolve six times faster based on 
nr18S sequence analyses (Shaked & de Vargas, 2006). As discussed by Decelle (2013) 
these differential traits and pressures of clade A, such as prevalence in the free-living 
environment with an occasional symbiotic lifestage (i.e., planktonic symbionts) provide a 
situation where the genomes are primarily influenced by external environmental pressures 
rather than host controlled traits. The resulting pressures are more likely to establish sexual 
exchanges within larger free-living populations, minimizing genomic impacts with often 
comparatively slower rates of evolution. In contrast, lineages that spend most of their 
lifecycle in hospite, which is arguably the case for most Symbiodinium clades (Table 1), tend 
to develop a certain dependence on the host which can lead to comparatively higher rates 
of change due to genome reduction and higher genetic drift associated with the absence of 
purifying selection through sexual recombination (Lynch, Koskella & Schaack, 2006). 

Our analysis of relative rates of evolution also indicated that mitochondrial genes 
evolved at approximately half the rate of nuclear and chloroplastic genes. This result 
appears to contrast markedly with the recent study of Roy-Smith & Keeling (2012), which 
showed that silent site divergence of the mitochondrial genome in other protists with 
secondary red-algal derived plastids is 5-30 times greater than that of their 
plastid genomes. These contrasting results may in part arise because we compared the DNA 
bases of a few selected genes, whereas Roy-Smith & Keeling (2012) examined the silent site 
divergence of complete mitochondrial and plastid genomes. Nevertheless, 
as there is evidence that our results from a subset of genes match those of land plants 
and green algae, with more rapid rates of divergence in the plastid organelle, additional 
work is needed to further explore the implications of transitions between the free-living 
and symbiotic state for Symbiodinium, with a goal of gaining a more comprehensive 
understanding of the dynamics and mechanisms behind the different evolutionary 
trajectories observed in this study. Additionally, the increasing use of next-generation 
sequencing for characterizing entire Symbiodinium genomes (e.g., Barbrook, Voolstra 
& Howe, 2014) is an exciting avenue that provides unprecedented opportunities for 
the investigation of novel markers and paves the way for much more comprehensive 
phylogenomics studies to come. 
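
To make the idea of a between-compartment rate comparison concrete, the following Python sketch computes the mean pairwise proportion of differing sites (p-distance) for two toy alignments and reports their ratio. It is a deliberately crude proxy and is not the relative-rate method applied in this study; the function names, taxon labels, and sequences are hypothetical, and no correction for multiple substitutions or for shared phylogenetic history is attempted.

```python
from itertools import combinations
from typing import Dict


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_pairwise_distance(alignment: Dict[str, str]) -> float:
    """Average p-distance over all unordered pairs of taxa in one alignment."""
    distances = [
        p_distance(alignment[s], alignment[t])
        for s, t in combinations(sorted(alignment), 2)
    ]
    return sum(distances) / len(distances)


# Toy alignments only (hypothetical sequences, not from the study).
mitochondrial = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGTACGT"}
chloroplastic = {"A": "ACGTACGT", "B": "ACGAACGA", "C": "ACCTACGT"}

mito_mean = mean_pairwise_distance(mitochondrial)
plastid_mean = mean_pairwise_distance(chloroplastic)
ratio = plastid_mean / mito_mean
print(f"plastid/mitochondrial divergence ratio: {ratio:.2f}")
```

A ratio above 1 in such a comparison would indicate greater divergence in the plastid alignment than in the mitochondrial one for the same taxa, which is the direction of the pattern reported here.
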

CONCLUSIONS 

Our study examines the performance of six genetic markers from three organelles in samples 
representing all currently documented lineages of Symbiodinium. As such it represents 
a comprehensive phylogenetic reconstruction of Symbiodinium, and highlights differences 
in the taxonomic resolution of each marker and their relative value in addressing a variety 
of evolutionary questions. Our series of phylogenetic analyses were conducted to address 
three working hypotheses. Despite striking similarities among the single gene phylogenies 
from distinct cellular compartments, none were evolutionarily identical, confirming our 
first hypothesis. This result reflected within and among clade differences inherent to the 
individual markers. Our second hypothesis, however, was rejected: a 
supermatrix tree incorporating all investigated genes (4,703 bp alignment) resulted in 
a highly resolved phylogeny that was statistically identical to the nr28S gene. This result 
provides additional support for the use of nr28S as a 'clade-level' benchmark gene for 
Symbiodinium. Finally, compartment-specific differences in evolutionary rates among 
Symbiodinium clades and organelles were revealed, confirming our third hypothesis. 
Highest evolutionary rates were observed within the chloroplastic compartment, a 
pattern that was largely driven by fast evolving Symbiodinium clades D2 and I, two 
lineages that are rare in nature and which may be transitioning between free-living and 
symbiotic states. As such, rarity appears to associate with evolutionary innovation in a key 
functional compartment in Symbiodinium. The identification of different evolutionary 
trajectories in chloroplast genes that link with habitat and prevalence suggests that this 
organellar compartment is evolutionarily plastic and responsive. This finding may have 
important implications for our understanding of evolutionary processes that underpin a 
symbiotic lifestyle in this essential dinoflagellate group. Our analysis further revealed that 
investigated mitochondrial genes evolved at approximately half the rate of nuclear and 
chloroplastic genes, an observation that contrasts with comparatively fast mitochondrial 
rates previously documented in non-symbiotic protists with secondary red-algal 
derived plastids. Together these results further highlight the need for deeper genome 
sequencing for a variety of Symbiodinium taxa with rapidly advancing next-generation 
sequencing approaches to understand the evolution of these enigmatic yet critical 
symbionts. 